A Statistical Analysis of the TREC-3 Data

Authors

  • Jean Tague-Sutcliffe
  • James Blustein
Abstract

A statistical analysis of the TREC-3 data shows that performance differences across queries are greater than performance differences across participant runs. Generally, groups of runs which do not differ significantly are large, sometimes accounting for over half the runs. Correlation among the various performance measures is high.

Similar resources

Finding Opinionated Blogs Using Statistical Classifiers and Lexical Features

This paper systematically exploited various lexical features for opinion analysis on blog data using a statistical learning framework. Our experimental results using the TREC Blog track data show that all the features we explored effectively represent opinion expressions, and different classification strategies have a significant impact on opinion classification performance. We also present res...

Content Locality in Time-Ordered Document Collections

Using newswire data sources from the TREC corpus, we show that the distribution of relevant documents with respect to time can be decidedly non-uniform. Many TREC topics show time-based clustering of relevant documents. We denote this clustering content locality and provide a simple metric for its measurement in time-ordered document collections. There is a marked positive correlation between con...

Estimating the Number of Relevant Documents in Enormous Collections

In assessing information retrieval systems, it is important to know not only the precision of the retrieved set, but also to compare the number of retrieved relevant items to the total number of relevant items. For large collections, such as the TREC test collections, or the World Wide Web, it is not possible to enumerate the entire set of relevant documents. If the retrieved documents are eval...

Indian Statistical Institute, Kolkata at TREC 2010: Legal Interactive

Indian Statistical Institute, Kolkata participated in TREC for the first time this year. We participated in the TREC Legal Interactive task on two topics, namely Topic 301 and Topic 302. We reduced the size of the corpus by Boolean retrieval using Lemur 4.11 and followed it by a clustering technique. We chose members from each cluster (which we called seeds) for relevance judgement by the TA and as...

Query clustering and IR system detection. Experiments on TREC data

Variability in IR has been little considered as a way to improve system performance. In this paper, we consider linguistic variability of queries as a clue to predict which system will perform better for a particular query. More precisely, we cluster TREC topics with regard to 16 linguistic features. Each cluster is then associated with a system that will be used to process all the queries belong...

Publication date: 1994